We developed a low-cost, high-performance system for dynamic hand gesture recognition that combines MediaPipe with a Transformer model. MediaPipe is used to accurately extract hand key points, and the Transformer model recognizes gestures from them. The system supports eight primary gestures: swipe up, swipe down, swipe left, swipe right, thumbs up, OK, click, and enlarge. These gestures serve as alternatives to mouse and keyboard operations, simplifying human–computer interaction for media system control and presentation switching. The experimental results show that the Transformer-based model achieved over 99% recognition accuracy.
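As a rough illustration of how extracted key points might be arranged into a sequence before being passed to a Transformer, the sketch below turns a clip of per-frame hand landmarks into a fixed-length list of feature vectors. The 21 (x, y, z) key points per hand follow MediaPipe Hands; the clip length of 30 frames, the centering on the first frame's wrist position, the zero padding, and the function names are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Landmark = Tuple[float, float, float]  # (x, y, z) for one hand key point

# The eight gesture classes named in the abstract.
GESTURES = ["swipe up", "swipe down", "swipe left", "swipe right",
            "thumbs up", "OK", "click", "enlarge"]

NUM_LANDMARKS = 21                 # MediaPipe Hands reports 21 key points per hand
FEATURES_PER_FRAME = NUM_LANDMARKS * 3
SEQ_LEN = 30                       # assumed clip length in frames (hypothetical)


def clip_to_sequence(frames: Sequence[Sequence[Landmark]],
                     seq_len: int = SEQ_LEN) -> List[List[float]]:
    """Convert a clip into seq_len feature vectors of 63 values each.

    Coordinates are shifted by the wrist position (landmark 0) of the FIRST
    frame, so hand motion across frames, which swipes depend on, is kept.
    This centering is an assumption of the sketch, not the paper's method.
    """
    if not frames:
        raise ValueError("clip contains no frames")
    for frame in frames:
        if len(frame) != NUM_LANDMARKS:
            raise ValueError(f"expected {NUM_LANDMARKS} landmarks per frame")

    ox, oy, oz = frames[0][0]      # reference point: wrist in the first frame
    sequence: List[List[float]] = []
    for frame in frames[:seq_len]:  # truncate clips that are too long
        features: List[float] = []
        for x, y, z in frame:
            features.extend((x - ox, y - oy, z - oz))
        sequence.append(features)

    # Zero-pad clips that are too short (assumed padding scheme).
    missing = seq_len - len(sequence)
    sequence.extend([0.0] * FEATURES_PER_FRAME for _ in range(missing))
    return sequence
```

Each returned row would correspond to one time step of the model input, and the model's output layer would have one unit per entry in GESTURES.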